AITopics

2410.21595

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

arXiv.org Machine Learning, Oct-3-2024

Ranking Perspective for Tree-based Methods with Applications to Symbolic Feature Selection

Luo, Hengrui, Li, Meng

Tree-based methods are powerful nonparametric techniques in statistics and machine learning. However, their effectiveness, particularly in finite-sample settings, is not fully understood. Recent applications have revealed their surprising ability to distinguish transformations (which we call symbolic feature selection) that remain obscure under current theoretical understanding. This work provides a finite-sample analysis of tree-based methods from a ranking perspective. We link oracle partitions in tree methods to response rankings at local splits, offering new insights into their finite-sample behavior in regression and feature selection tasks. Building on this local ranking perspective, we extend our analysis in two ways: (i) We examine the global ranking performance of individual trees and ensembles, including Classification and Regression Trees (CART) and Bayesian Additive Regression Trees (BART), providing finite-sample oracle bounds, ranking consistency, and posterior contraction results. (ii) Inspired by the ranking perspective, we propose concordant divergence statistics $\mathcal{T}_0$ to evaluate symbolic feature mappings and establish their properties. Numerical experiments demonstrate the competitive performance of these statistics in symbolic feature selection tasks compared to existing methods.

partition, ranking perspective, selection, (14 more...)

2410.02623

Country:

North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government (0.45)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Huisman, Tim, van der Linden, Jacobus G. M., Demirović, Emir

Optimal Survival Trees: A Dynamic Programming Approach

arXiv.org Artificial Intelligence, Jan-9-2024

Survival analysis studies and predicts the time of death, or other singular unrepeated events, based on historical data, while the true time of death for some instances is unknown. Survival trees enable the discovery of complex nonlinear relations in a compact, human-comprehensible model by recursively splitting the population and predicting a distinct survival distribution in each leaf node. We use dynamic programming to provide the first survival tree method with optimality guarantees, enabling the assessment of the optimality gap of heuristics. We improve the scalability of our method through a special algorithm for computing trees up to depth two. The experiments show that our method's run time even outperforms some heuristics for realistic cases while obtaining out-of-sample performance similar to the state of the art.
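To make "a distinct survival distribution in each leaf node" concrete, here is a minimal sketch of one standard estimator for censored data, the Kaplan-Meier curve, computed over the samples that fall into a single leaf. The abstract does not say which estimator the authors use, so the choice of Kaplan-Meier and the function name `kaplan_meier` are assumptions made for illustration only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve for the samples in one (hypothetical) leaf.

    times  -- observed time for each sample (event or censoring time)
    events -- 1 if the event was observed, 0 if the sample was censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all samples that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored samples leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Four samples in a leaf; the second sample at time 3 is censored.
leaf_curve = kaplan_meier([2, 3, 3, 5], [1, 1, 0, 1])
```

Censored samples never lower the curve directly. They only shrink the risk set for later event times, which is how the estimator uses instances whose true event time is unknown.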

decision tree, leaf node, surtree, (14 more...)

2401.04489

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

arXiv.org Artificial Intelligence, Sep-21-2022

Tree Methods for Hierarchical Classification in Parallel

Heinsen, Franz A.

We propose methods that enable efficient hierarchical classification in parallel. Our methods transform a batch of classification scores and labels, corresponding to given nodes in a semantic tree, to scores and labels corresponding to all nodes in the ancestral paths going down the tree to every given node, relying only on tensor operations that execute efficiently on hardware accelerators. We implement our methods and test them on current hardware accelerators with a tree incorporating all English-language synsets in WordNet 3.0, spanning 117,659 classes in 20 levels of depth. We transform batches of scores and labels to their respective ancestral paths, incurring negligible computation and consuming only a fixed 0.04GB of memory over the footprint of data.
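The label half of this transformation can be sketched as a precomputed lookup table: every node's root-to-node path is stored once, padded to the depth of the tree, so that mapping a batch of labels to ancestral paths becomes pure indexing. This is an illustrative sketch in plain Python, not the authors' implementation. The toy tree, the padding value `-1`, and the names `build_ancestral_paths` and `labels_to_paths` are assumptions. On an accelerator the table would be a tensor and the lookup a single gather.

```python
def build_ancestral_paths(parent, pad=-1):
    """Precompute the root-to-node path of every node, padded to equal length.

    parent[i] is the parent of node i, or None if node i is the root.
    """
    paths = []
    for node in range(len(parent)):
        path = []
        current = node
        while current is not None:
            path.append(current)
            current = parent[current]
        path.reverse()  # store paths going down the tree, root first
        paths.append(path)
    max_depth = max(len(p) for p in paths)
    return [p + [pad] * (max_depth - len(p)) for p in paths]


# Toy tree (hypothetical): 0 is the root, 1 and 2 are its children,
# 3 is a child of 1, and 4 is a child of 3.
parent = [None, 0, 0, 1, 3]
PATHS = build_ancestral_paths(parent)


def labels_to_paths(labels):
    """Map a batch of node labels to their padded ancestral paths by lookup."""
    return [PATHS[label] for label in labels]


batch_paths = labels_to_paths([4, 2])
# batch_paths == [[0, 1, 3, 4], [0, 2, -1, -1]]
```

Because the table is built once, the per-batch cost is a lookup of fixed size, in line with the fixed memory overhead the abstract reports.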

ancestral path, machine learning, natural language, (17 more...)

2209.10288

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Zhu, Yi, Filipov, Evgueni T.

Harnessing Interpretable Machine Learning for Holistic Inverse Design of Origami

arXiv.org Artificial Intelligence, Jul-18-2022

This work harnesses interpretable machine learning methods to address the challenging inverse design problem of origami-inspired systems. We show that a decision tree-random forest method is particularly suitable for fitting origami databases, containing both design features and functional performance, to generate human-understandable decision rules for the inverse design of functional origami. First, the tree method is unique because it can handle complex interactions between categorical features and continuous features, allowing it to compare different origami patterns for a design. Second, this interpretable method can tackle multi-objective problems for designing functional origami with multiple and multi-physical performance targets. Finally, the method can extend existing shape-fitting algorithms for origami to consider non-geometrical performance. The proposed framework enables holistic inverse design of origami, considering both shape and function, to build novel reconfigurable structures for various applications such as metamaterials, deployable structures, soft robots, biomedical devices, and many more.

artificial intelligence, machine learning, origami, (18 more...)

doi: 10.1038/s41598-022-23875-6

2204.07235

Country:

North America > United States > Michigan (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > Hamburg (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Min, Joosung, Elliott, Lloyd T.

Q-learning with online random forests

arXiv.org Machine Learning, Apr-7-2022

$Q$-learning is the most fundamental model-free reinforcement learning algorithm. Deployment of $Q$-learning requires approximation of the state-action value function (also known as the $Q$-function). In this work, we provide online random forests as $Q$-function approximators and propose a novel method wherein the random forest is grown as learning proceeds (through expanding forests). We demonstrate improved performance of our methods over state-of-the-art Deep $Q$-Networks in two OpenAI gyms ('blackjack' and 'inverted pendulum') but not in the 'lunar lander' gym. We suspect that the resilience to overfitting enjoyed by random forests recommends our method for common tasks that do not require a strong representation of the problem domain. We show that expanding forests (in which the number of trees increases as data comes in) improve performance, suggesting that expanding forests are viable for other applications of online random forests beyond the reinforcement learning setting.
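For readers unfamiliar with the update that the $Q$-function approximator has to support, the standard $Q$-learning rule can be sketched as follows. This sketch deliberately swaps the paper's online random forest for a plain lookup table (a `defaultdict`), so it shows only the temporal-difference target, not the authors' method. The state and action names and the function `q_update` are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.99, done=False):
    """Apply one Q-learning update to the table q and return the new value.

    q is a mapping from (state, action) to an estimated value. A function
    approximator such as a random forest would replace this table.
    """
    # Value of the best action in the next state (zero if the episode ended).
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)  # all values start at 0.0
actions = ["hit", "stick"]
first = q_update(q, "s0", "hit", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
second = q_update(q, "s0", "hit", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
```

With an approximator instead of a table, the pair `(state, action)` and the computed `target` would form one training example for the regressor.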

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2204.03771

Country:

North America > United States > Massachusetts (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

arXiv.org Machine Learning, Nov-4-2019

Variable Grouping Based Bayesian Additive Regression Tree

Su, Yuhao, Ding, Jie

Ensemble methods for regression have been highly successful in obtaining high-accuracy predictions. Examples are bagging, random forests, boosting, BART (Bayesian additive regression trees), and their variants. In this paper, we propose a new perspective named variable grouping to enhance the predictive performance. The main idea is to seek a grouping of variables such that there is no nonlinear interaction term between variables of different groups. Given a sum-of-learner model, each learner will only be responsible for one group of variables, which would be more efficient in modeling nonlinear interactions. We propose a two-stage method named variable grouping based Bayesian additive regression tree (GBART), with a well-developed Python package, gbart, available. The first stage is to search for potential interactions and an appropriate grouping of variables. The second stage is to build a final model based on the discovered groups. Experiments on synthetic and real data show that the proposed method can perform significantly better than classical approaches.

bart, dataset, random forest, (13 more...)

1911.00922

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence, Oct-29-2019, 01:22:48 GMT

Conversational Sentiment Analysis

I recently built a movie recommender that takes as input a user-written passage about liked and/or disliked movies. At the outset of the project I figured that determining which movies users liked and disliked would be simple. After all, using text to determine whether someone likes or dislikes a movie doesn't seem too ambitious. With the variety of packages readily available for sentiment analysis in Python, there had to be something available out of the box to do this job. As it turns out, using text to determine whether someone likes or dislikes a movie, or any named entity, is deceptively complex.

dependency tree, sentiment, sentiment analysis, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.65)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.65)

arXiv.org Machine Learning, May-8-2018

Several Tunable GMM Kernels

Li, Ping

While tree methods have been popular in practice, researchers and practitioners are also looking for simple algorithms which can reach accuracy similar to that of trees. In 2010, Ping Li (UAI'10) developed the method of "abc-robust-logitboost" and compared it with other supervised learning methods on datasets used by the deep learning literature. In this study, we propose a series of "tunable GMM kernels" which are simple and perform largely comparably to tree methods on the same datasets. Note that "abc-robust-logitboost" substantially improved the original "GDBT" in that (a) it developed a tree-split formula based on second-order information of the derivatives of the loss function; (b) it developed a new set of derivatives for the multi-class classification formulation. In the prior study in 2017, the "generalized min-max" (GMM) kernel was shown to have good performance compared to the "radial-basis function" (RBF) kernel. However, as demonstrated in this paper, the original GMM kernel is often not as competitive as tree methods on the datasets used in the deep learning literature. Since the original GMM kernel has no parameters, we propose tunable GMM kernels by adding tuning parameters in various ways. Three basic (i.e., with only one parameter) GMM kernels are the "$e$GMM kernel", "$p$GMM kernel", and "$\gamma$GMM kernel", respectively. Extensive experiments show that they are able to produce good results for a large number of classification tasks. Furthermore, the basic kernels can be combined to boost the performance.
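As background, the original parameter-free GMM kernel can be sketched as below. The definition used here (split each coordinate into its positive and negative parts, then take the ratio of summed minima to summed maxima) follows the commonly cited form of the generalized min-max kernel and is not spelled out in this abstract, so treat it as an assumption. The tunable $e$GMM, $p$GMM, and $\gamma$GMM variants are not reproduced because the abstract does not give their formulas. The function name `gmm_kernel` is illustrative.

```python
def gmm_kernel(u, v):
    """Generalized min-max (GMM) kernel between two real-valued vectors.

    Each coordinate x is expanded into two nonnegative coordinates,
    (max(x, 0), max(-x, 0)), and the kernel is sum(min) / sum(max)
    over the expanded vectors.
    """
    def split(x):
        expanded = []
        for value in x:
            expanded.append(value if value > 0 else 0.0)   # positive part
            expanded.append(-value if value < 0 else 0.0)  # negative part
        return expanded

    u_t, v_t = split(u), split(v)
    numerator = sum(min(a, b) for a, b in zip(u_t, v_t))
    denominator = sum(max(a, b) for a, b in zip(u_t, v_t))
    # Two all-zero vectors have no defined ratio; return 0.0 by convention here.
    return numerator / denominator if denominator > 0 else 0.0


same = gmm_kernel([1.0, -2.0], [1.0, -2.0])       # identical vectors
different = gmm_kernel([1.0, -2.0], [2.0, 1.0])   # partially overlapping
```

The kernel has no parameter to tune, which is the limitation the abstract cites as motivation for the tunable variants.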

artificial intelligence, deep learning, machine learning, (17 more...)

1805.0283

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

@machinelearnbot, Jul-1-2017, 13:25:04 GMT

Making data science accessible - Machine Learning – Tree Methods

Tree methods are commonly used in data science to understand patterns within data and to build predictive models. The term Tree Methods covers a variety of techniques with different levels of complexity, but my aim is to highlight three I find useful. To set the problem up, let's assume we have a census dataset containing age, education, employment status and so on. Given all this information, we want to see if we can predict whether a person earns more than $50k per year. How can tree methods help us?

artificial intelligence, machine learning, tree method, (15 more...)

Industry: Education (0.57)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)